Fault-Detection and Isolation Algorithms for Health Monitoring of Electronics Subjected to Shock and Vibration
Abstract
Failures in electronics subjected to shock and vibration are typically diagnosed using built-in self-test (BIST) or continuity monitoring of daisy-chained packages. BIST, which is used extensively for diagnostics and identification of failure, focuses on reactive failure detection and provides limited insight into reliability and residual life. In this paper, a new technique has been developed for health monitoring and failure mode classification based on measured damage precursors. A feature extraction technique in the joint time-frequency domain has been developed along with pattern classifiers for fault diagnosis of electronics at the product level. The Karhunen-Loève transform (KLT) has been used for feature reduction and de-correlation of the feature vectors for fault mode classification in electronic assemblies. Euclidean, Mahalanobis, and Bayesian distance classifiers based on joint time-frequency analysis have been used for classification of the resulting feature space. Previously, the authors have developed damage precursors based on time and spectral techniques for health monitoring of electronics without reliance on continuity data from daisy-chained packages. Statistical pattern recognition techniques based on wavelet packet energy decomposition [Lall 2006] have been studied by the authors for quantification of shock damage in electronic assemblies, and auto-regressive moving average and time-frequency techniques have been investigated for system identification, condition monitoring, and fault detection and diagnosis in electronic systems [Lall 2008]. However, identification of specific failure modes was not possible. In this paper, various fault modes such as solder interconnect failure, missing interconnects, chip delamination, and chip cracking in various packaging architectures have been classified using clustering of feature vectors based on the KLT approach [Goumas 2002].
The KLT de-correlates the feature space and identifies the dominant directions that describe the space, eliminating directions that encode little useful information about the features [Qian 1996, Schalkoff 1972, Theodoridis 1998, Tou 1974]. The clustered damage precursors have been correlated with underlying damage. Several chip-scale packages have been studied, with lead-free second-level interconnects including SAC105 and SAC305 alloys. Transient strain has been measured during the drop event using digital image correlation and high-speed cameras operating at 100,000 fps. Continuity has been monitored simultaneously for failure identification. Fault-mode classification has been done using KLT and joint time-frequency analysis of the experimental data. In addition, explicit finite element models have been developed and various kinds of failure modes have been simulated, such as solder ball cracking, trace fracture, package falloff, and solder ball failure. Models using cohesive elements at the solder joint-copper pad interface on both the PCB side and the package side have also been created to study the traction-separation behavior of solder. Fault modes predicted by simulation-based precursors have been correlated with those from experimental data.

Proceedings of the SEM Annual Conference June 1-4, 2009 Albuquerque New Mexico USA ©2009 Society for Experimental Mechanics Inc.

INTRODUCTION

Currently in electronic packaging, BIST, fuses, and canaries are used extensively for failure detection in electronics. These failure detection procedures in ICs ensure a high level of product functionality. Monitoring electronics involves a trade-off between effectiveness and the cost and time involved in design, manufacturing, and maintenance. BIST has several advantages that reduce cost and time. For example, BIST reduces dependence on automatic test equipment (ATE), which reduces the effect of current in the design.
BIST is also effective in many ways. It speeds up system testing of the circuit under test (CUT) [Hamzaoglu 2000]. It also overcomes the limitation of pins in the packaging and utilizes the extra area available on the chip, so that more information about faults is obtained. BIST is used in several forms, such as on-line BIST and off-line BIST. On-line BIST is mostly used for electrical monitoring of the chip or the functionality of the IC, that is, for monitoring whether the circuit is behaving correctly. Off-line BIST is used for detecting and monitoring actual physical damage in the circuit. Structural faults in the circuit are mainly due to external loads experienced by the packages in manufacturing and field operations. Ideally, a BIST should have high fault coverage and low overhead on its circuit design, but there is always a trade-off that compromises the effectiveness of the BIST. One of the major concerns for a packaging and design engineer is the size of the BIST logic. Fault coverage and overhead are directly driven by the size of the BIST, so 100% fault coverage also increases the overhead involved in the design and implementation of the BIST. Hence application of BIST is always a compromise between cost and effectiveness.

Fuses and canaries are also used for detecting and controlling faults in electronic systems. Fuses [Anderson 2004] are used to sense abnormalities such as surges and fluctuations in voltage and temperature limits in the system and to restore normal operating conditions. Canaries are special devices mounted on the standard device being monitored. A canary has an accelerated form of the same failure mechanism as the standard device on which it is mounted, and hence fails faster. This property of canaries is used for measuring the actual time to failure [Mishra 2002] of the standard device. Canaries are used for identification of physics of failure in electronic packages.
They are used for studying low-cycle fatigue (solder joint fatigue) [Anderson 2004], corrosion, and changes in and sudden exposure to temperature and vibration transients. One of the major challenges in the use of fuses and canaries is that they need frequent replacement and repair. Hence it is difficult to integrate them with the main system.

The health monitoring techniques for electronic packaging discussed above have limited scope. They are all based on reactive failure detection and hence are primarily diagnostic in nature. They do not give any information on remaining residual life, how and when damage initiates, what the trend of damage progression is, or which failure mode is dominant in the electronic system. These questions are better answered by techniques that are predictive in nature. Previously, the authors have developed techniques driven by statistical pattern recognition for structural health monitoring of electronics. These studies quantified damage initiation and progression [Lall 2006, 2007, 2008] in electronics subjected to drop and shock. Statistical pattern recognition techniques based on wavelet packet energy decomposition [Lall 2006] have been studied for quantification of shock damage in electronic assemblies. Leading indicators for system-level damage in portable electronics have been developed based on wavelet packet energy decomposition [Lall 2006] and joint time-frequency analysis [Lall 2007]. Auto-regressive moving average and time-frequency techniques have been investigated for system identification, condition monitoring, and fault detection and diagnosis in electronic systems [Lall 2008]. Currently, damage quantification is based on electrical continuity, which limits visibility into damage initiation and progression.
Damage precursors based on time and spectral techniques for health monitoring of electronics do not rely on continuity data from daisy-chained packages. This technique of structural health monitoring involves extensive off-line processing of data obtained from sensors strategically placed at various target points. Techniques such as digital image correlation [Lall 2006] are also used for data acquisition of global and local responses of the electronic system subjected to drop and shock. Structural health monitoring (SHM), i.e. assessing the current state of the system and establishing a knowledge base for predictions about the system state, has previously been used in various engineering fields, such as delamination in composites [Saravanos 1994], damage detection in aerospace structures [Robinson 1996, Doebling 1995], and offshore structures [Brinker 1995]. These methods of structural health monitoring have also found application in performance assessment of machinery systems [Lee 1995, Chuang 2004]. Statistical pattern recognition has previously been applied in other disciplines such as biology [Christodoulou 1999], psychology [Dellaert 1996], medicine [Holt 1998], marketing [Apte 1997], artificial intelligence [Kohonen 1988], and computer vision [Low 1990]. SHM of electronics by statistical pattern recognition is relatively new.

In this paper, the dominant failure modes occurring with every drop and shock event in electronics are classified using the pre-failure feature space. Damage due to drop and shock in electronic packages can involve a wide variety of failure modes occurring at various competing locations in different packaging architectures. The damage is due to overstresses developed by repetitive loading occurring with each drop event.
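As a rough illustration of what a joint time-frequency feature can look like, the sketch below computes a normalized short-time spectral energy map from a transient signal. It is not the authors' feature extraction method: the function name `band_energy_features`, the frame length, hop size, band count, sampling rate, and the two synthetic decaying sinusoids are all assumptions chosen only to make the example self-contained.

```python
import numpy as np

def band_energy_features(signal, frame_len=64, hop=32, n_bands=4):
    """Return a (frames x bands) map of normalized short-time spectral energy.

    Hypothetical helper: a short-time Fourier transform with a Hann window,
    with the power spectrum of each frame summed into coarse frequency bands.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    bands = np.array_split(power, n_bands, axis=1)
    energy = np.stack([b.sum(axis=1) for b in bands], axis=1)
    return energy / energy.sum()  # normalize so the whole map sums to 1

# Synthetic stand-ins for measured transient responses (assumed values).
fs = 1000.0                      # sampling rate in Hz
t = np.arange(512) / fs
baseline = np.exp(-8 * t) * np.sin(2 * np.pi * 60 * t)
shifted = np.exp(-8 * t) * np.sin(2 * np.pi * 300 * t)

baseline_map = band_energy_features(baseline)
shifted_map = band_energy_features(shifted)

# Index of the band that carries the most energy in each signal.
baseline_band = int(baseline_map.sum(axis=0).argmax())
shifted_band = int(shifted_map.sum(axis=0).argmax())
print(baseline_band, shifted_band)
```

Flattening such a map gives one feature vector per drop event; a shift of energy between bands or frames is the kind of change a pre-failure feature space is meant to capture.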
Previously, the authors have developed damage precursors based on time and spectral techniques for health monitoring and damage detection in electronics without reliance on continuity data from daisy-chained packages. This paper focuses on classification of failure modes based on leading indicators in the pre-failure space. The methodology developed in this work is based on de-correlation of the joint time-frequency feature space by the Karhunen-Loève transform (KLT). Various fault modes such as solder interconnect failure, missing interconnects, chip delamination, and chip cracking are classified. These fault modes are the most frequently occurring in electronic packages subjected to drop and shock. Fault mode classification for assessing system-level damage of electronics subjected to drop and shock is relatively new.

TEST VEHICLES

Two test vehicles have been used to study classification of failure mechanisms and modes in electronics under shock-impact loading. The test vehicles have been labeled test board A and test board B. Test vehicle A is a multilayer FR4 printed circuit board with four 1156 I/O FPGAs. The packages are fully functional field programmable gate arrays (FPGAs) and not daisy-chained devices. The FPGA test board has been connected to a LabVIEW data-acquisition system through an NI 6541 digital generator and a CB 2162 connector board to collect the digital data.

Figure 1: Test Vehicle-A Packages, 1156 FPGA. (a) 35 x 35 mm with 34 x 34 solder array. (b) Interconnect Array Configuration.

The set-up enables each FPGA to write a square pulse across the solder interconnects, charge an external capacitor, and read the square pulse back from the second pair of second-level interconnects for the package being tested. Figure 1a shows the location of the FPGA solder interconnects being tested for detection of damage initiation and propagation.
The solder interconnects being tested have been strategically selected based on the location of failure under thermo-mechanical loads and under shock and vibration loads. Area-array packages often fail in the die-shadow area under thermo-mechanical loads, while the corner interconnects fail under shock and vibration loads. Figure 1b shows the interconnect array configuration of the 1156 I/O FPGA package. The package is 35 x 35 mm in size and has a full array of solder interconnects in a 34 x 34 array configuration at 1 mm pitch. The packages have Sn3Ag0.5Cu solder interconnects.

Figure 2: Test Vehicle-A Printed Circuit Assembly
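The classification approach outlined in the introduction (KLT de-correlation of the feature space followed by a distance-based classifier) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`klt_fit`, `klt_project`, `fit_classes`, `classify`), the six-dimensional synthetic features, the class labels, and the choice of two retained components are all hypothetical. Only the Mahalanobis variant is shown; replacing each inverse covariance with the identity matrix gives the Euclidean classifier.

```python
import numpy as np

def klt_fit(features, n_components):
    """Fit a KLT basis: eigenvectors of the feature covariance matrix,
    ordered by decreasing eigenvalue and truncated to n_components."""
    mean = features.mean(axis=0)
    cov = np.cov(features - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues ascending
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]

def klt_project(features, mean, basis):
    """Project features onto the retained KLT directions."""
    return (features - mean) @ basis

def fit_classes(reduced, labels):
    """Store the mean and inverse covariance of each labeled cluster."""
    stats = {}
    for label in np.unique(labels):
        pts = reduced[labels == label]
        stats[str(label)] = (pts.mean(axis=0),
                             np.linalg.inv(np.cov(pts, rowvar=False)))
    return stats

def classify(sample, stats):
    """Return the label with the smallest squared Mahalanobis distance."""
    def dist(label):
        mu, cov_inv = stats[label]
        d = sample - mu
        return float(d @ cov_inv @ d)
    return min(stats, key=dist)

# Synthetic feature vectors standing in for measured precursors (assumed).
rng = np.random.default_rng(0)
fault_mean = np.array([4.0, 4.0, 0.0, 0.0, 0.0, 0.0])
healthy = rng.normal(0.0, 1.0, (50, 6))
cracked = rng.normal(0.0, 1.0, (50, 6)) + fault_mean
features = np.vstack([healthy, cracked])
labels = np.array(["healthy"] * 50 + ["solder_crack"] * 50)

mean, basis = klt_fit(features, n_components=2)
reduced = klt_project(features, mean, basis)
stats = fit_classes(reduced, labels)

predicted = classify(klt_project(fault_mean, mean, basis), stats)
print(predicted)
```

Because the retained directions are eigenvectors of the covariance matrix, the projected features are uncorrelated, which is the de-correlation property the text attributes to the KLT.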
Similar resources
Online Fault Detection and Isolation Method Based on Belief Rule Base for Industrial Gas Turbines
Real-time and accurate fault detection has attracted increasing attention with the growing demand for higher operational efficiency and safety of industrial gas turbines as complex engineering systems. Current methods based on condition monitoring data have drawbacks in using both expert knowledge and quantitative information for detecting faults. For this reason, this paper proposes...
Very High Frequency Monitoring System for Engine Gearbox and Generator Health Management
In cooperation with the major propulsion engine manufacturers, the authors are developing and demonstrating a unique very high frequency (VHF) vibration monitoring system that integrates various vibroacoustic data with intelligent feature extraction and fault isolation algorithms to effectively assess engine gearbox and generator health. The system is capable of reporting on the early detection...
Bearing Fault Detection in Induction Motor Using Fast Fourier Transform
ABSTRACT: In the present scenario, every industry needs a condition-based monitoring system to avoid unwanted faults in process components. Vibration condition monitoring is widely used for fault detection and is the most reliable method of assessing the overall health of a motor system. In this paper we work on a 2 hp induction motor. Ball bearing fault is widely occu...
Subspace-based fault detection algorithms for vibration monitoring
We address the problem of detecting faults modeled as changes in the eigenstructure of a linear dynamical system. This problem is of primary interest for structural vibration monitoring. The purpose of the paper is to describe and analyze new fault detection algorithms, based on recent stochastic subspace-based identification methods and the statistical local approach to the design of detection ...
Application of Thau Observer for Fault Detection of Micro Parallel Plate Capacitor Subjected to Nonlinear Electrostatic Force
This paper investigates the fault detection of a micro parallel plate capacitor subjected to nonlinear electrostatic force. To this end, the Thau observer, which has good ability in fault detection of nonlinear systems, has been presented, together with the governing nonlinear dynamic equation of the capacitor. Upper and lower thresholds for fault detection have been obtained. The robustness of th...